32 research outputs found

    PersLay: A Neural Network Layer for Persistence Diagrams and New Graph Topological Signatures

    Persistence diagrams, the most common descriptors of Topological Data Analysis, encode topological properties of data and have already proved pivotal in many different applications of data science. However, since the (metric) space of persistence diagrams is not Hilbert, they end up being difficult inputs for most Machine Learning techniques. To address this concern, several vectorization methods have been put forward that embed persistence diagrams into either a finite-dimensional Euclidean space or an (implicit) infinite-dimensional Hilbert space with kernels. In this work, we focus on persistence diagrams built on top of graphs. Relying on extended persistence theory and the so-called heat kernel signature, we show how graphs can be encoded by (extended) persistence diagrams in a provably stable way. We then propose a general and versatile framework for learning vectorizations of persistence diagrams, which encompasses most of the vectorization techniques used in the literature. We finally showcase the experimental strength of our setup by achieving competitive scores on classification tasks on real-life graph datasets.
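
    To make the idea of a finite-dimensional vectorization concrete, here is a minimal sketch of one generic permutation-invariant scheme: each diagram point is mapped to a feature vector, weighted, and the results are summed. The Gaussian point transformation, the persistence weighting, and the names `vectorize_diagram`, `centers`, and `sigma` are assumptions chosen for illustration; they are not taken from the abstract, and in a learned framework such parameters would be trained rather than fixed.

```python
import numpy as np

def vectorize_diagram(diagram, centers, sigma=0.1):
    """Map a persistence diagram to a fixed-size vector.

    diagram: array-like of shape (n_points, 2) holding (birth, death) pairs.
    centers: array of shape (n_centers, 2), hypothetical sample locations.
    Returns an array of shape (n_centers,).
    """
    diagram = np.asarray(diagram, dtype=float).reshape(-1, 2)
    # Assumed weight: the persistence (death - birth) of each point.
    weights = diagram[:, 1] - diagram[:, 0]
    # Squared distances between every diagram point and every center.
    sq_dist = ((diagram[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    # Assumed point transformation: a Gaussian bump evaluated at each center.
    features = np.exp(-sq_dist / (2.0 * sigma ** 2))
    # Summation is permutation invariant, so point order does not matter.
    return (weights[:, None] * features).sum(axis=0)

centers = np.array([[0.0, 1.0], [0.5, 0.5], [0.2, 0.8]])
diagram = np.array([[0.0, 1.0], [0.2, 0.5], [0.1, 0.9]])
vector = vectorize_diagram(diagram, centers)
shuffled_vector = vectorize_diagram(diagram[::-1], centers)
empty_vector = vectorize_diagram(np.empty((0, 2)), centers)
```

    Because the output is a plain sum over points, reordering the diagram leaves the vector unchanged, and an empty diagram maps to the zero vector.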

    Understanding the Topology and the Geometry of the Space of Persistence Diagrams via Optimal Transport

    Despite the obvious similarities between the metrics used in topological data analysis and those of optimal transport, an optimal-transport based formalism to study persistence diagrams and similar topological descriptors has yet to be developed. In this article, by considering the space of persistence diagrams as a space of discrete measures, and by observing that its metrics can be expressed as optimal partial transport problems, we introduce a generalization of persistence diagrams, namely Radon measures supported on the upper half plane. Such measures naturally appear in topological data analysis when considering continuous representations of persistence diagrams (e.g. persistence surfaces) but also as limits for laws of large numbers on persistence diagrams or as expectations of probability distributions on the space of persistence diagrams. We explore topological properties of this new space, which will also hold for the closed subspace of persistence diagrams. New results include a characterization of convergence with respect to Wasserstein metrics, a geometric description of barycenters (Fréchet means) for any distribution of diagrams, and an exhaustive description of continuous linear representations of persistence diagrams. We also showcase the strength of this framework to study random persistence diagrams by providing several statistical results made meaningful thanks to this new formalism.

    Estimation and Quantization of Expected Persistence Diagrams

    Persistence diagrams (PDs) are the most common descriptors used to encode the topology of structured data appearing in challenging learning tasks; think e.g. of graphs, time series or point clouds sampled close to a manifold. Given random objects and the corresponding distribution of PDs, one may want to build a statistical summary, such as a mean, of these random PDs, which is however not a trivial task as the natural geometry of the space of PDs is not linear. In this article, we study two such summaries, the Expected Persistence Diagram (EPD), and its quantization. The EPD is a measure supported on ℝ², which may be approximated by its empirical counterpart. We prove that this estimator is optimal from a minimax standpoint on a large class of models with a parametric rate of convergence. The empirical EPD is simple and efficient to compute, but possibly has a very large support, hindering its use in practice. To overcome this issue, we propose an algorithm to compute a quantization of the empirical EPD, a measure with small support which is shown to approximate with near-optimal rates a quantization of the theoretical EPD.
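
    As a rough illustration of the empirical counterpart mentioned above, the sketch below treats each diagram as a sum of unit point masses and averages n diagrams by pooling their points with mass 1/n each. The function names `empirical_epd` and `epd_mass` and the toy diagrams are assumptions made for this example; the quantization algorithm from the abstract is not reproduced here.

```python
import numpy as np

def empirical_epd(diagrams):
    """Average n diagrams as measures: pool all points, each with mass 1/n.

    diagrams: list of arrays of shape (n_i, 2) holding (birth, death) pairs.
    Returns (points, weights) describing a weighted discrete measure.
    """
    n = len(diagrams)
    points = np.vstack(diagrams)
    weights = np.full(len(points), 1.0 / n)
    return points, weights

def epd_mass(points, weights, birth_range, death_range):
    """Mass that the discrete measure assigns to an axis-aligned rectangle."""
    (b0, b1), (d0, d1) = birth_range, death_range
    inside = (
        (points[:, 0] >= b0) & (points[:, 0] <= b1)
        & (points[:, 1] >= d0) & (points[:, 1] <= d1)
    )
    return weights[inside].sum()

# Two toy diagrams (hypothetical data).
diagrams = [
    np.array([[0.0, 1.0], [0.2, 0.5]]),
    np.array([[0.1, 0.9]]),
]
points, weights = empirical_epd(diagrams)
total_mass = weights.sum()  # average number of points per diagram
corner_mass = epd_mass(points, weights, (0.0, 0.15), (0.8, 1.0))
```

    The support of the pooled measure grows with the number of diagrams, which is the practical issue that motivates replacing it with a measure supported on only a few points.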

    Topological Uncertainty: Monitoring trained neural networks through persistence of activation graphs

    Although neural networks are capable of achieving astonishing performance in a wide variety of contexts, properly training networks on complicated tasks requires expertise and can be expensive from a computational perspective. In industrial applications, data coming from an open-world setting might widely differ from the benchmark datasets on which a network was trained. Being able to monitor the presence of such variations without retraining the network is of crucial importance. In this article, we develop a method to monitor trained neural networks based on the topological properties of their activation graphs. To each new observation, we assign a Topological Uncertainty, a score that aims to assess the reliability of the predictions by investigating the whole network instead of its final layer only, as typically done by practitioners. Our approach works entirely at a post-training level and does not require any assumption on the network architecture or optimization scheme, nor the use of data augmentation or auxiliary datasets; it can be faithfully applied to a large range of network architectures and data types. We showcase experimentally the potential of Topological Uncertainty in the context of trained network selection, Out-Of-Distribution detection, and shift-detection, both on synthetic and real datasets of images and graphs.

    On the Existence of Monge Maps for the Gromov-Wasserstein Problem

    In this work, we study the structure of minimizers of the quadratic Gromov-Wasserstein (GW) problem on Euclidean spaces for two different costs. The first one is the scalar product for which we prove that it is always possible to find optimizers as Monge maps and we detail the structure of such optimal maps. The second cost is the squared Euclidean distance for which we show that the worst case scenario is the existence of a bi-map structure. Both results are direct and indirect consequences of an existence result of optimal maps in the standard optimal transportation problem for costs that are defined by submersions. In dimension one for the squared Euclidean distance, we show numerical evidence for a negative answer to the existence of a Monge map under the conditions of Brenier's theorem, suggesting that our result cannot be improved in general. In addition, we show that a monotone map is optimal in some non-symmetric situations, thereby giving insight on why such a map often appears to be optimal in numerical experiments.

    Climate Science at the Interface Between Topological Data Analysis and Dynamical Systems Theory

    The authors are hosting an AMS-sponsored Mathematics Research Community (MRC) on novel applications of topological data analysis (TDA) and dynamical systems theory to the study of climate change and weather forecasting. In this Notices article we introduce some of the big challenges in climate science, and describe how methods from TDA and dynamical systems theory can help tackle these. We hope to encourage applications to the MRC from mathematicians with a background in applied algebraic topology and scientists working on climate or meteorology, both from academia and industry.

    A Gradient Sampling Algorithm for Stratified Maps with Applications to Topological Data Analysis

    We introduce a novel gradient descent algorithm extending the well-known Gradient Sampling methodology to the class of stratifiably smooth objective functions, which are defined as locally Lipschitz functions that are smooth on some regular pieces, called the strata, of the ambient Euclidean space. For this class of functions, our algorithm achieves a sub-linear convergence rate. We then apply our method to objective functions based on the (extended) persistent homology map computed over lower-star filters, which is a central tool of Topological Data Analysis. For this, we propose an efficient exploration of the corresponding stratification by using the Cayley graph of the permutation group. Finally, we provide benchmark and novel topological optimization problems, in order to demonstrate the utility and applicability of our framework.